Machine Learning review and intro to tidymodels
- Read about the hotel booking data, hotels, on the Tidy Tuesday page it came from. There is also a link to an article from the original authors. The outcome we will be predicting is called is_canceled.
babies, is_repeated_guest, previous_cancellations, and deposit_type might be predictive variables. Guests traveling with babies may face more emergencies, so their plans are more likely to change. Repeated guests may be less likely to cancel because they already know the hotel. previous_cancellations is a good predictor because it reflects the guest's general booking habits. Guests who have paid a deposit should be less likely to cancel because they do not want to lose their money.
Some of the variables were engineered from other variables from different database tables.
The model should tell us which variables are most important for predicting is_canceled and how each one affects it.
- Create some exploratory plots or table summaries of the variables in the dataset. Be sure to also examine missing values or other interesting values. You may want to adjust the fig.width and fig.height in the code chunk options.
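library(tidyverse)   # graphing and data cleaning
library(tidymodels)  # modeling
library(naniar)      # analyzing missing values
library(vip)         # variable importance plots
theme_set(theme_minimal())

hotels <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-11/hotels.csv')

# Histograms of every numeric variable, one panel per variable
hotels %>%
  select(where(is.numeric)) %>%
  pivot_longer(cols = everything(),
               names_to = "variable",
               values_to = "value") %>%
  ggplot(aes(x = value)) +
  geom_histogram(bins = 30) +
  facet_wrap(vars(variable),
             scales = "free",
             nrow = 5)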

# Bar charts of every categorical variable, one panel per variable
hotels %>%
  mutate(across(where(is.character), as.factor)) %>%
  select(where(is.factor)) %>%
  pivot_longer(cols = everything(),
               names_to = "variable",
               values_to = "value") %>%
  ggplot(aes(x = value)) +
  geom_bar() +
  facet_wrap(vars(variable),
             scales = "free",
             nrow = 4)

# Count rows by their number of missing values
hotels %>%
  add_n_miss() %>%
  count(n_miss_all)
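The missing-value count above only catches true NA values. The recipe later in this assignment compares agent and company against the string 'NULL', which suggests some missing information is stored as text. Here is a minimal base R sketch on a made-up data frame (the `toy` data and its values are hypothetical, not taken from hotels) that checks for both kinds:

```r
# Hypothetical toy data: NA marks a true missing value,
# the string "NULL" marks a missing value stored as text
toy <- data.frame(
  children = c(0, NA, 2, 1),
  country  = c("PRT", "GBR", NA, "ESP"),
  agent    = c("NULL", "9", "NULL", NA),
  stringsAsFactors = FALSE
)

# Number of NA values in each row (similar in spirit to n_miss_all)
n_miss_all <- rowSums(is.na(toy))
print(table(n_miss_all))

# Number of "NULL" strings in each column, which is.na() does not see
null_strings <- sapply(toy, function(col) sum(col == "NULL", na.rm = TRUE))
print(null_strings)
```

The same two checks could be run on hotels itself to see how many rows are affected by each kind of missingness.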
- First, we will do a couple things to get the data ready.
hotels_mod <- hotels %>%
  # Treat the outcome and all character columns as factors
  mutate(is_canceled = as.factor(is_canceled)) %>%
  mutate(across(where(is.character), as.factor)) %>%
  # Drop variables that should not be used as predictors
  select(-arrival_date_year,
         -reservation_status,
         -reservation_status_date) %>%
  # Keep only rows with no missing values
  add_n_miss() %>%
  filter(n_miss_all == 0) %>%
  select(-n_miss_all)

set.seed(494)

# 50/50 split, stratified on the outcome
hotel_split <- initial_split(hotels_mod, prop = .5, strata = is_canceled)
hotel_training <- training(hotel_split)
hotel_testing <- testing(hotel_split)
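To see what strata = is_canceled is meant to achieve, here is a small base R sketch of a stratified 50/50 split. It samples half of the row indices within each outcome class, so both halves keep the same class proportions. The outcome vector is simulated, and this is only an illustration of the idea, not how rsample implements it.

```r
set.seed(494)

# Hypothetical outcome: 630 kept bookings ("0") and 370 cancellations ("1")
outcome <- factor(rep(c("0", "1"), times = c(630, 370)))

# Sample half of the row indices separately within each class
train_idx <- unlist(lapply(
  split(seq_along(outcome), outcome),
  function(idx) sample(idx, size = floor(0.5 * length(idx)))
))

train <- outcome[train_idx]
test  <- outcome[-train_idx]

# Both halves keep the original share of cancellations
print(prop.table(table(train)))
print(prop.table(table(test)))
```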
- In this next step, we are going to do the pre-processing. Usually, I won’t tell you exactly what to do here, but for your first exercise, I’ll tell you the steps.
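hotel_recipe <- recipe(is_canceled ~ ., data = hotel_training) %>%
  # Indicator variables; an agent or company is present when the value is not 'NULL'
  step_mutate(has_child = as.factor(as.numeric(children > 0)),
              has_baby = as.factor(as.numeric(babies > 0)),
              has_precancel = as.factor(as.numeric(previous_cancellations > 0)),
              has_agent = as.factor(as.numeric(agent != 'NULL')),
              has_company = as.factor(as.numeric(company != 'NULL')),
              # Keep the 5 most common countries, lump the rest into "Other"
              country = fct_lump_n(country, 5)) %>%
  # Remove the variables that the indicators replace
  step_rm(children,
          babies,
          previous_cancellations,
          agent,
          company) %>%
  # Center and scale the quantitative predictors
  step_normalize(all_predictors(),
                 -all_nominal()) %>%
  # Dummy variables for the categorical predictors
  step_dummy(all_nominal(),
             -all_outcomes())

# Check the pre-processed training data
hotel_recipe %>%
  prep(hotel_training) %>%
  juice()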
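As a rough picture of what the normalize and dummy steps do, here is a base R sketch on a made-up data frame. The column names lead_time and meal are borrowed from the hotels data, but the values are hypothetical. The point is that the center and scale are estimated on the training rows only and then reused for new rows, and that each categorical level (except the reference level) becomes its own 0/1 column. This mimics the idea, not the recipes internals.

```r
# Hypothetical training rows and new rows
train <- data.frame(lead_time = c(10, 50, 200, 30),
                    meal = factor(c("BB", "HB", "BB", "SC")))
new   <- data.frame(lead_time = c(20, 120),
                    meal = factor(c("HB", "BB"), levels = levels(train$meal)))

# Estimate center and scale on the training data only
center <- mean(train$lead_time)
scale  <- sd(train$lead_time)
normalize <- function(x) (x - center) / scale

# Normalized numeric column plus 0/1 dummy columns (intercept dropped)
train_x <- cbind(lead_time = normalize(train$lead_time),
                 model.matrix(~ meal, train)[, -1])
new_x   <- cbind(lead_time = normalize(new$lead_time),
                 model.matrix(~ meal, new)[, -1])

print(train_x)
print(new_x)
```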
- In this step we will set up a LASSO model and workflow.
LASSO shrinks some coefficients to exactly 0, which removes those predictors from the model. In this case, we have almost 30 predictors, which is a lot, so dropping some of them should help avoid overfitting.
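To illustrate why coefficients can land at exactly 0, here is a small sketch of the soft-thresholding operator. In the simplest setting (a single standardized predictor in a linear model) this is the LASSO solution: the unpenalized coefficient is pulled toward 0 by the penalty and set to 0 once the penalty exceeds its size. The coefficient values and penalties below are hypothetical, and the logistic model fit here with glmnet is more involved than this one-step picture.

```r
# Soft-thresholding: shrink z toward 0 by lambda, and set it to 0
# when |z| is not larger than lambda
soft_threshold <- function(z, lambda) {
  sign(z) * pmax(abs(z) - lambda, 0)
}

# Hypothetical unpenalized coefficients
z <- c(big = 2.0, medium = 0.6, small = 0.1, negative = -0.3)

# A small penalty shrinks everything a little; a large one zeroes the weak ones
shrunk_small_penalty <- soft_threshold(z, lambda = 0.05)
shrunk_large_penalty <- soft_threshold(z, lambda = 0.5)

print(rbind(unpenalized = z,
            lambda_0.05 = shrunk_small_penalty,
            lambda_0.5  = shrunk_large_penalty))
```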
hotel_lasso_mod <-
  logistic_reg(mixture = 1) %>%
  set_engine("glmnet") %>%
  set_args(penalty = tune()) %>%
  set_mode("classification")

hotel_lasso_wf <-
  workflow() %>%
  add_recipe(hotel_recipe) %>%
  add_model(hotel_lasso_mod)

hotel_lasso_wf
## == Workflow ====================================================================
## Preprocessor: Recipe
## Model: logistic_reg()
##
## -- Preprocessor ----------------------------------------------------------------
## 4 Recipe Steps
##
## * step_mutate()
## * step_rm()
## * step_normalize()
## * step_dummy()
##
## -- Model -----------------------------------------------------------------------
## Logistic Regression Model Specification (classification)
##
## Main Arguments:
## penalty = tune()
## mixture = 1
##
## Computational engine: glmnet
- In this step, we’ll tune the model and fit the model using the best tuning parameter to the entire training dataset.
set.seed(494) # for reproducibility

hotel_cv <- vfold_cv(hotel_training, v = 5)

penalty_grid <- grid_regular(penalty(),
                             levels = 10)
penalty_grid
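The printed grid is not shown here, so as a rough check on what it should contain: assuming the dials default range for penalty() is 1e-10 to 1 on a log10 scale (my understanding of the default, not something confirmed by output in this document), a regular grid with 10 levels would be ten evenly log-spaced values. A base R sketch of that assumption:

```r
# Ten values evenly spaced on the log10 scale between 1e-10 and 1
# (assumed to mirror grid_regular(penalty(), levels = 10))
manual_grid <- 10^seq(-10, 0, length.out = 10)
print(manual_grid)
```

Comparing this vector with penalty_grid$penalty would confirm or refute the assumption.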
hotel_lasso_tune <-
  hotel_lasso_wf %>%
  tune_grid(
    resamples = hotel_cv,
    grid = penalty_grid
  )

hotel_lasso_tune

hotel_lasso_tune %>%
  collect_metrics() %>%
  filter(.metric == "accuracy")

hotel_lasso_tune %>%
  collect_metrics() %>%
  filter(.metric == "accuracy") %>%
  ggplot(aes(x = penalty, y = mean)) +
  geom_point() +
  geom_line() +
  scale_x_log10(breaks = scales::trans_breaks("log10", function(x) 10^x),
                labels = scales::trans_format("log10", scales::math_format(10^.x))) +
  labs(x = "penalty", y = "accuracy")

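# Penalty with the best cross-validated accuracy
best_param <- hotel_lasso_tune %>%
  select_best(metric = "accuracy")

# Plug the best penalty into the workflow and fit on the full training set
hotel_lasso_final_wf <- hotel_lasso_wf %>%
  finalize_workflow(best_param)

hotel_lasso_final_mod <- hotel_lasso_final_wf %>%
  fit(data = hotel_training)

# extract_fit_parsnip() replaces the deprecated pull_workflow_fit()
hotel_lasso_final_mod %>%
  extract_fit_parsnip() %>%
  tidy()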
arrival_date_month_September, market_segment_Groups, market_segment_Undefined, distribution_channel_Undefined, and assigned_room_type_L have coefficients of 0.
- Now that we have a model, let’s evaluate it a bit more. All we have looked at so far is the cross-validated accuracy from the previous step.
# Variable importance plot; extract_fit_parsnip() replaces the deprecated pull_workflow_fit()
hotel_lasso_final_mod %>%
  extract_fit_parsnip() %>%
  vip()

reserved_room_type_P is the most important variable, followed by deposit_type_Non.Refund and has_precancel_X1. I’m not very surprised, because the reserved room type may indicate roughly how many guests there are and what the purpose of the reservation is, which could influence the likelihood of cancelling the reservation.
hotel_lasso_test <- hotel_lasso_final_wf %>%
  last_fit(hotel_split)

hotel_lasso_test %>%
  collect_metrics()

preds <- collect_predictions(hotel_lasso_test)

conf_mat(preds, is_canceled, .pred_class)
##           Truth
## Prediction     0     1
##          0 34179  7777
##          1  3404 14333
The test metric is slightly lower than the cross-validated one, but they are pretty close to each other.
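# True positive rate (sensitivity): canceled bookings correctly predicted as canceled
14333/(14333+7777)
## [1] 0.6482587
# True negative rate (specificity): kept bookings correctly predicted as kept
34179/(34179+3404)
## [1] 0.9094271
# Accuracy: all correct predictions over all bookings
(34179+14333)/(34179+14333+7777+3404)
## [1] 0.8126916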
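The three calculations above can also be written once against a named matrix, which makes it harder to mix up the cells. This is a base R sketch that re-enters the counts from the confusion matrix output by hand; the object names (cm, tp, and so on) are my own.

```r
# Counts copied from the confusion matrix above
# (rows = prediction, columns = truth; "1" means canceled)
cm <- matrix(c(34179, 3404, 7777, 14333),
             nrow = 2,
             dimnames = list(prediction = c("0", "1"),
                             truth = c("0", "1")))

tp <- cm["1", "1"]  # predicted canceled, actually canceled
tn <- cm["0", "0"]  # predicted kept, actually kept
fp <- cm["1", "0"]  # predicted canceled, actually kept
fn <- cm["0", "1"]  # predicted kept, actually canceled

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
accuracy    <- (tp + tn) / sum(cm)

print(c(sensitivity = sensitivity,
        specificity = specificity,
        accuracy = accuracy))
```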
preds %>%
  ggplot(aes(x = .pred_1, fill = is_canceled)) +
  geom_density(alpha = 0.5, color = NA)

- What would this graph look like for a model with an accuracy that was close to 1?
If the accuracy were close to 1, the red density (is_canceled = 0) would lie almost entirely below .pred_1 = 0.5 and the blue density (is_canceled = 1) would lie almost entirely above 0.5, with very little overlap between them.
- Our predictions are classified as canceled if their predicted probability of canceling is greater than .5. If we wanted to have a high true positive rate, should we make the cutoff for predicted as canceled higher or lower than .5?
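Lower than 0.5. With a lower cutoff, more bookings are predicted as canceled, so more of the true cancellations are caught.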
- What happens to the true negative rate if we try to get a higher true positive rate?
It will be lower, because more of the bookings that were not canceled will now be predicted as canceled.
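The trade-off in the last two answers can be checked with a short simulation. This sketch uses made-up predicted probabilities (drawn from two beta distributions, not the hotel model's output) and computes the true positive and true negative rates at three cutoffs. Whatever the simulated values are, lowering the cutoff can only keep or raise the true positive rate and keep or lower the true negative rate.

```r
set.seed(494)

# Hypothetical data: 600 kept bookings (0) and 400 cancellations (1)
truth <- rep(c(0, 1), times = c(600, 400))
# Simulated predicted probabilities of cancellation
prob  <- c(rbeta(600, 2, 5), rbeta(400, 5, 2))

# True positive and true negative rates at a given cutoff
rates_at <- function(cutoff) {
  pred <- as.numeric(prob > cutoff)
  c(cutoff = cutoff,
    tpr = sum(pred == 1 & truth == 1) / sum(truth == 1),
    tnr = sum(pred == 0 & truth == 0) / sum(truth == 0))
}

cutoffs <- c(0.3, 0.5, 0.7)
rates <- t(sapply(cutoffs, rates_at))
print(rates)
```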
- Let’s say that this model is going to be applied to bookings 14 days in advance of their arrival at each hotel, and someone who works for the hotel will make a phone call to the person who made the booking. During this phone call, they will try to assure that the person will be keeping their reservation or that they will be canceling in which case they can do that now and still have time to fill the room. How should the hotel go about deciding who to call? How could they measure whether it was worth the effort to do the calling? Can you think of another way they might use the model?
The hotel should call guests with reserved room type P because it is the variable that influences the outcome the most. To measure whether the calls were worth the effort, it could be helpful to look at other important variables, such as deposit type, previous cancellations, and assigned room type, that make guests more likely to cancel. They could also use the model to improve their reservation system, for example by making all deposits non-refundable to lower the likelihood of cancellation.
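Another way to frame the question of who to call is to use the predicted probabilities directly and weigh them against the cost of a call. The sketch below is purely illustrative: the bookings, the call cost, and the value of learning about a cancellation early are all hypothetical numbers, not estimates from the data.

```r
# Hypothetical bookings with predicted probabilities of cancellation
bookings <- data.frame(
  booking_id  = 1:6,
  pred_cancel = c(0.92, 0.15, 0.61, 0.40, 0.78, 0.05)
)

# Hypothetical cost of one call and value of refilling a room early
call_cost    <- 5
refill_value <- 40

# Expected gain from calling each booking
bookings$expected_gain <- bookings$pred_cancel * refill_value - call_cost

# Call only where the expected gain is positive, highest gain first
call_list <- bookings[bookings$expected_gain > 0, ]
call_list <- call_list[order(-call_list$expected_gain), ]
print(call_list)
```

With real cost figures from the hotel, the same calculation would give a cutoff on the predicted probability instead of a fixed 0.5.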
- How might you go about questioning and evaluating the model in terms of fairness? Are there any questions you would like to ask of the people who collected the data?
I would like to know the proportions of younger and older guests and of different gender and racial groups, to check whether the data underrepresent any group. I would also like to ask the people who collected the data which variables they think most influence cancellation, because their prior beliefs may have led to a biased collection of data.